Method and systems for ordered dynamic distribution of packet flows over network processing means

ABSTRACT

A method and systems for dynamically distributing packet flows over multiple network processing means and recombining packet flows after processing while keeping packet order even for traffic wherein an individual flow exceeds the performance capabilities of a single network processing means is disclosed. After incoming packets have been analyzed to identify the flow the packets are parts of, the sequenced load balancer of the invention dynamically distributes packets to the connected independent network processors. A balance history is created per flow and updated each time a packet of the flow is received and/or transmitted. Each balance history memorizes, in time order, the identifier of network processor having handled packets of the flow and the associated number of processed packets. Processed packets are then transmitted back to a high-speed link or memorized to be transmitted back to the high-speed link later, depending upon the current status of the balance history.

FIELD OF THE INVENTION

The present invention relates to the field of network processing where packets traversing a packet switching network are analyzed by network processing means, and more specifically to a method and systems for dynamically distributing packet flows over multiple network processing means and recombining packet flows after processing while keeping packet order, even for traffic wherein an individual flow exceeds the performance capabilities of a single network processing means.

BACKGROUND OF THE INVENTION

In network processing systems, packets traversing a switching network are generally analyzed by network processors that execute functions on the packets including routing, segmentation and re-assembly, filtering and virus scanning, to increase performance, security and service quality. However, because the operations that network processors may be required to execute on packets are becoming more complex, and because bandwidth and packet transmission rates are increasing faster than network processor processing power, devices and methods that increase the overall processing performance of network processors accordingly are essential.

A common method for achieving higher processing performance than a single processor or network processor can provide consists in parallel processing, where multiple processors operate in parallel. Such multiple processors may be considered as a single network processor of higher speed.

In the context of network processing, parallel processing has, in prior art, been implemented as load balancing, or channel striping. Prior art channel striping (also known as load sharing or inverse multiplexing) is frequently used in networking because of processing bottlenecks or simply because of price/performance ratio. In that scheme, a Round-Robin Algorithm or a Load Sharing Algorithm is used that stripes the packets belonging to a stream across multiple channels. A major problem with striping is that packets may be mis-ordered due to different delays on different channels and due to different packet sizes. Three types of solutions for this mis-ordering problem are known in the prior art:

-   i) keeping each flow on only one channel and accepting that a single flow cannot use more bandwidth than each channel can support,
-   ii) reordering the received packets after mis-ordering and accepting the resulting waste of processing bandwidth, and
-   iii) splitting packets up into fixed transfer units which the network processing means can process in a predictable period of time.

Dynamic load balancing, on the other hand, is commonly used in the field of computational parallel processing, dealing with three general computing entities: computations, tasks and data. In these cases, dynamic load balancing tries to find the mapping of computations, tasks or data, to computers that results in each computer having an approximately equal amount of work in order to reduce run time and increase the overall efficiency of a computation.

U.S. patent application Ser. No. 09/551,049 assigned to IBM Corporation and filed before the United States Patent and Trademark Office on Apr. 18, 2000, describes a real-time load-balancing system for distributing a sequence of incoming data packets emanating from a high speed communication line to a plurality of processing means, each operating at a capacity that is lower than the capacity of the high speed communication line. The system comprises parser means capable of extracting a configurable set of classifier bits from the incoming packets for feeding into compression means. The compression means are capable of reducing a bit pattern of length K to a bit pattern having a length L which is a fraction of K. This system further comprises a pipeline block for delaying incoming packets until a load balancing decision is found, and an inverse demultiplexer for receiving a port identifier output from said compression means as selector and for directing pipelined packets to the appropriate output port.

However, there is still a need for preserving the correct sequencing of flows, particularly for traffic wherein an individual flow exceeds the performance capability of a single network processor. Ordered recombination of packet flows is straightforward if the packets can be modified. An obvious method would be to label each incoming packet with a sequence number, and to prevent output packets from exiting in non-sequential order. However, the disadvantage of packet modification is that the individual network processors must be configured differently in an aggregated configuration than in a single network processor configuration, to correctly process the modified packets.

While this need arises from the performance limits of current network processing means, satisfying it also allows previous generations of network processing means to be reused by merging their performance to reach the desired level, thus optimizing the cost of such network processing means.

SUMMARY OF THE INVENTION

Thus, it is a broad object of the invention to remedy the shortcomings of the prior art as described here above.

It is another object of the invention to provide a method and systems for dynamically distributing packet flows over multiple network processing means and recombining packet flows after processing while keeping packet order even for traffic wherein an individual flow exceeds the performance capabilities of a single network processing means, without modifying the packets or modifying the operation of each single network processing means.

It is a further object of the invention to provide a method and systems for dynamically distributing packet flows over multiple network processing means having different processing powers and recombining packet flows after processing while keeping packet order even for traffic wherein an individual flow exceeds the performance capabilities of a single network processing means.

The accomplishment of these and other related objects is achieved by a method for ordered dynamic distribution of packet flows from a high-speed link over network processing means that comprises the steps of:

-   parsing the header of an incoming packet to extract flow identifier;
-   creating a balance history associated to said flow identifier if it does not exist;
-   analyzing network processing means loading and setting a current network processing means;
    -   if the identifier of said current network processing means is different than the one having previously processed at least one packet of the same flow, memorizing said identifier of said current network processing means in said balance history and setting to one the number of packets processed by said current network processing means in said balance history;
    -   else if the identifier of said current network processing means is the same as the one having previously processed at least one packet of the same flow, increasing the number of packets processed by said current network processing means in said balance history; and,
-   routing said incoming packet to said current network processing means,

and by a method to recombine packets processed by a plurality of network processing means according to the method as described above, comprising the steps of:

-   parsing the header of a processed packet to extract flow identifier;
-   getting the earliest network processing means identifier and associated number of processed packets from the balance history related to said flow identifier;
    -   if said processed packet has not been processed by the earliest network processing means, it is stored in a packet memory;
    -   else if said processed packet has been processed by the earliest network processing means, said processed packet is transmitted to the high-speed link, said associated number of processed packets is decremented and, if said associated number of processed packets reaches zero, earliest network processing means identifier changes to the next in said balance history and packets queued in said packet memory corresponding to the new earliest network processing means identifier may be transmitted to the high-speed link and then removed from said packet memory.

Further advantages of the present invention will become apparent to the ones skilled in the art upon examination of the drawings and detailed description. It is intended that any additional advantages be incorporated herein.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a network processing system illustrating the use of a sequenced load balancer according to the invention.

FIG. 2 illustrates an architecture example of the system of the invention for dynamically distributing packet flows over network processing means.

FIG. 3 illustrates the content of balance history memory.

FIG. 4 describes a network switching system comprising processing systems based on sequenced load balancers according to the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 depicts a network processing system illustrating the use of a sequenced load balancer according to the invention to distribute the traffic flows of an aggregate high-speed link onto a multiplicity of independent network processing entities. In this example, the network processing system comprises a sequenced load balancer device 100 according to the invention and four independent network processors A to D, referred to as 110-1 to 110-4, respectively. It is to be understood that sequenced load balancer device 100 may be connected to more or fewer than four network processors, this system scheme being described for the sake of illustration. A high-speed link consisting of incoming and outgoing ports (generically referred to as 120 and 130 respectively) transmits/receives packet flows to/from sequenced load balancer device 100 so as to exchange data with a network or a network device such as a switching fabric, not represented for the sake of clarity. Incoming and outgoing ports 120-i and 130-j (i=1, . . . , m, j=1, . . . , n) may handle similar or different types of data. Likewise, incoming and outgoing ports 120-i and 130-j may be connected to different types of networks or network devices depending upon requested data processing, e.g. routing or filtering. The number of incoming ports 120 may be equal to the number of outgoing ports 130 (n=m) or different (n≠m). The system further comprises connections 140-k and 150-k (k=1, . . . , 4) to exchange packets between sequenced load balancer device 100 and independent network processor k.

The network processing system illustrated in FIG. 1 makes it possible to process packet flows of a high-speed link with independent network processors whose packet processing rates are lower than the packet transmission rate of the high-speed link. To that end, sequenced load balancer device 100 analyzes the incoming packet flows and dynamically distributes packets to the connected independent network processors. After processing, packets are recombined in sequenced load balancer 100 so as to be transmitted back to the high-speed link, preserving ordering of all packet flows.

According to the method of the invention, each incoming packet is analyzed so as to determine the corresponding flow identifier and a network processing means is assigned to process this incoming packet according to network processing means load. A balance history is created per flow to memorize the sequence of used network processing means and the corresponding number of processed packets. When packets processed by a network processing means are transmitted back to the high-speed link, the identifier of this network processing means and the associated number of processed packets are removed from the balance history. Thus, the algorithm that handles incoming data comprises the steps of:

-   parsing the header of an incoming packet to extract a flow identifier;
-   hashing the extracted flow identifier to generate a different identifier, referred to as a flow bucket identifier;
-   determining the current network processing identifier by analyzing the load of network processing means;
    -   if the current network processing identifier is the same as the one having previously processed at least one packet of the same flow, increasing the number of packets being processed by the current network processing means recorded in the balance history memory;
    -   else if the current network processing identifier is different than the one having previously processed at least one packet of the same flow, memorizing the current network processing identifier in the balance history memory and setting to one the number of packets processed by the current network processing means; and,
-   routing packet to the current network processing means.
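The receive-side steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the device's implementation: the names `flow_bucket`, `dispatch` and `N_BUCKETS`, the use of CRC-32 as the hash, and the shortest-FIFO selection rule are all hypothetical choices made for the example.

```python
import zlib
from collections import defaultdict, deque

N_BUCKETS = 1024  # assumed bucket count; the text does not fix a value


def flow_bucket(flow_id: bytes) -> int:
    """Hash a flow identifier to a flow bucket identifier (CRC-32 is an assumption)."""
    return zlib.crc32(flow_id) % N_BUCKETS


def dispatch(flow_id: bytes, fifo_depths: list, history: dict) -> int:
    """Pick a processor, update the balance history, and return the processor index."""
    bucket = flow_bucket(flow_id)
    # Assumed load rule: choose the processor whose FIFO is currently shortest.
    np_id = min(range(len(fifo_depths)), key=lambda i: fifo_depths[i])
    entries = history[bucket]  # deque of [np_id, packet_count], oldest first
    if entries and entries[-1][0] == np_id:
        entries[-1][1] += 1          # same processor as the latest entry
    else:
        entries.append([np_id, 1])   # new processor: start a new entry at one
    fifo_depths[np_id] += 1          # the packet is queued for that processor
    return np_id


history = defaultdict(deque)
depths = [0, 5]
first = [dispatch(b"flow-A", depths, history) for _ in range(3)]
depths[0] = 9                        # processor 0 becomes congested
later = dispatch(b"flow-A", depths, history)
```

In this run the first three packets go to processor 0 and the fourth to processor 1, so the balance history for the bucket holds two entries in time order: three packets on processor 0, then one on processor 1.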

After incoming packets have been processed, it is necessary to recombine them before transmitting them back to the high-speed link. Thus, each processed packet is analyzed so as to determine the corresponding flow identifier and the corresponding balance history is used to respect packet ordering. Processed packets may be buffered if earlier packets of the same flow have not yet been processed. After packets have been transmitted back to the high-speed link, balance history is updated and the buffer is released. The algorithm that handles processed data comprises the steps of:

-   parsing the header of a processed packet to extract the flow identifier;
-   hashing the extracted flow identifier to generate the corresponding flow bucket identifier;
-   getting the earliest network processing means identifier and associated number of processed packets for that flow bucket from the balance history memory;
    -   if packet has not been processed by the earliest network processing means recorded for its flow bucket, it is stored in a packet memory;
    -   else if packet has been processed by the earliest network processing means, it is transmitted to the high-speed link, the associated number of processed packets is decremented and, if this associated number of processed packets reaches zero, earliest network processing means identifier changes to the next in the balance history and packets queued in packet memory corresponding to the new earliest network processing means identifier may be transmitted to the high-speed link and then removed from the packet memory.
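The transmit-side steps above can be sketched as follows. This is an illustrative sketch only: the function `recombine`, the `pending` store keyed by (flow bucket, processor identifier), and the `out` list standing in for the high-speed link are hypothetical names, and the flow bucket is assumed to have been computed already by the parsing and hashing steps.

```python
from collections import defaultdict, deque


def recombine(packet, bucket, np_id, history, pending, out):
    """Emit `packet` in order, or park it until earlier packets of its bucket are sent."""
    entries = history[bucket]  # deque of [np_id, packet_count], oldest first
    if not entries:
        raise ValueError("processed packet has no balance history entry")
    if entries[0][0] != np_id:
        pending[(bucket, np_id)].append(packet)  # not from the earliest processor
        return
    out.append(packet)
    entries[0][1] -= 1
    # When the earliest entry is exhausted, advance and drain parked packets.
    while entries and entries[0][1] == 0:
        entries.popleft()
        if not entries:
            break
        parked = pending[(bucket, entries[0][0])]
        while parked and entries[0][1] > 0:
            out.append(parked.popleft())
            entries[0][1] -= 1


# Packets a, b were sent to processor 2, then c to processor 1, then d to processor 2.
history = {7: deque([[2, 2], [1, 1], [2, 1]])}
pending, out = defaultdict(deque), []
# Processor 1 finishes first, so c arrives before a and b.
for pkt, np_id in [("c", 1), ("a", 2), ("b", 2), ("d", 2)]:
    recombine(pkt, 7, np_id, history, pending, out)
```

Packet c is parked until a and b have been sent, and the output order matches the original arrival order. Keying the parked packets by processor identifier relies on each processor returning its own packets in the order it received them.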

FIG. 2 illustrates the architecture of a sequenced load balancer device 100 for dynamically distributing packet flows over network processing means according to the method described above. Sequenced load balancer device 100 comprises a high-speed link consisting of incoming and outgoing ports 120-i (i=1, . . . , m) and 130-j (j=1, . . . , n), respectively, and connections 140-1 to 140-4 and 150-1 to 150-4 to transmit/receive packets to/from network processors 110-1 to 110-4, respectively. Again, it is to be understood that sequenced load balancer device 100 may be connected to more or fewer than four network processors, this implementation example being described for the sake of illustration.

The receive side of sequenced load balancer device 100 consists of Time Division Multiplexing (TDM) unit 200, header parsing unit 205, hash function unit 210, balance history memory 215, demultiplexer unit 220, pipeline unit 225, FIFO memory units 230-1 to 230-4 and current network processor determining unit 235. Packets arriving on the incoming ports 120 of the high-speed link are combined in TDM unit 200 according to standard TDM algorithm and then examined by sequenced load balancer device 100. Each packet is routed to one of the several network processors 110-1 to 110-4, based on the packet type, which flow the packet is part of, and the amount of current load on each of the network processors 110-1 to 110-4. Incoming packet headers are parsed in header parsing unit 205 so as to extract flow identifiers that are transmitted to hash function unit 210 wherein flow identifiers are hashed to generate different identifiers, referred to as flow bucket identifiers. Hash function unit 210 ascertains that packets belonging to a same flow are identified with an identical flow bucket. In general, the hashing function will be configured such that the number of flow bucket identifiers is significantly less than the number of possible flow identifiers, but this is not a requirement of the invention.

The parser unit 205 should be capable of providing flexibility in terms of number and types of extracted flow identifiers and of ensuring a broad applicability for various protocols. A preferred embodiment of the parser unit 205 is a re-configurable finite state machine device.

Flow buckets are used as indexes in balance history memory 215 wherein a current network processor identifier is assigned to current packet. The identifiers of all the network processors having handled packets of a same flow and the associated number of packets processed are stored, in processing time order, with associated flow bucket in balance history memory 215.

FIG. 3 illustrates the content of balance history memory 215 wherein each row of the table represents the processing history of a flow. First column, referred to as 300, identifies the flows by means of flow buckets as discussed above, while other columns represent network processor identifier and associated number of processed packets, referred to as 305-i and 310-i respectively. Thus, column 305-1 represents the last current network processor identifier for each active flow and column 310-1 the associated number of processed packets, column 305-2 represents the previous current network processor identifier for each active flow and column 310-2 the associated number of processed packets, and so on. An active flow bucket is defined as a flow bucket wherein at least one packet has not been transmitted back to the output port 130. In the example shown in FIG. 3, 125 packets belonging to flow having a flow bucket equal to 6302 are being processed by current network processor having identifier equal to 1 and 265 packets of this flow bucket have been previously processed by network processor having identifier equal to 2.
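One possible in-memory representation of a row of the table is sketched below, using the flow bucket 6302 values quoted from FIG. 3. The dictionary layout and the helper `table_row` are assumptions made for illustration; the text does not prescribe how balance history memory 215 is organized internally.

```python
from collections import deque

# Oldest entry first; each entry is (network processor identifier, packet count).
balance_history = {6302: deque([(2, 265), (1, 125)])}


def table_row(bucket, history):
    """Return the row in the column order 300, 305-1, 310-1, 305-2, 310-2, ..."""
    # Columns 305-1/310-1 hold the most recent entry, so walk the deque backwards.
    return [bucket] + [field for entry in reversed(history[bucket]) for field in entry]


row = table_row(6302, balance_history)   # [6302, 1, 125, 2, 265]
earliest = balance_history[6302][0]      # (2, 265): the entry the transmit side consumes first
```

Storing entries oldest-first keeps both ends cheap to reach: the receive side appends or increments at the right end, while the transmit side reads and removes from the left end.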

Now turning back to FIG. 2, current network processor identifier is transmitted to demultiplexer unit 220 so that current received packet being delayed in pipeline unit 225 is stored in the FIFO memory (230-1 to 230-4) corresponding to current network processor identifier, from which it is transmitted to the current network processor. Current network processor determining unit 235 sets the current network processor by determining activity of network processors 110-1 to 110-4 through load analysis of FIFO memories 230-1 to 230-4 by means of standard statistical techniques. For example, a network processor may be set as current network processor each time the loading of its associated FIFO memory is less than a predetermined threshold. Another example of determining current network processor consists in selecting the network processor whose associated FIFO memory is just about to be empty; in such a case the sequenced load balancer will not be switching between network processors very often. It is also possible to use flow histories to optimize selection of current network processor.
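As a hedged illustration of a threshold-based selection, the sketch below keeps the current network processor while its FIFO depth stays under a threshold and otherwise switches to the processor with the shortest FIFO. The function name `select_current`, the switching rule and the numeric values are assumptions; the text leaves the exact policy open.

```python
def select_current(fifo_depths, current, threshold):
    """Keep the current processor while its FIFO is below `threshold`;
    otherwise switch to the processor with the shortest FIFO."""
    if fifo_depths[current] < threshold:
        return current
    return min(range(len(fifo_depths)), key=lambda i: fifo_depths[i])


kept = select_current([3, 0, 5, 1], current=0, threshold=4)      # stays on processor 0
switched = select_current([4, 2, 5, 1], current=0, threshold=4)  # moves to processor 3
```

Staying on the same processor while it has headroom keeps each balance history short, since a new entry is only created when the current processor changes.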

It is to be noticed that sequenced load balancer 100 may be connected to network processors having different processing powers without any modification since even if a network processor having a greater processing power than others empties its associated FIFO memory faster, sequenced load balancer 100 selects it more often as current network processor because current network processor is determined according to FIFO memory loading. Another solution requiring modification of sequenced load balancer 100 consists in storing the processing power of each network processor in current network processor determining unit 235 with associated network processor identifier. Processing power and associated network processor identifier are used in conjunction with FIFO memory (230-1 to 230-4) load to determine current network processor so as to optimize loading of network processing means.
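One way to combine stored processing power with FIFO load, sketched here under assumptions, is to estimate how long each FIFO would take to drain and pick the smallest estimate. The function `select_weighted` and the ratio depth / power are illustrative choices, not a rule stated in the text.

```python
def select_weighted(fifo_depths, powers):
    """Choose the processor with the smallest estimated drain time (depth / power)."""
    return min(range(len(fifo_depths)), key=lambda i: fifo_depths[i] / powers[i])


# Equal FIFO depths: the processor with twice the power is preferred.
equal_depth = select_weighted([4, 4], powers=[1.0, 2.0])
# A faster processor with a much fuller FIFO can still lose to a slower, emptier one.
faster_but_fuller = select_weighted([2, 6], powers=[1.0, 2.0])
```

With these inputs the first call selects processor 1 (drain estimate 2 versus 4) and the second selects processor 0 (drain estimate 2 versus 3).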

The transmit side of sequenced load balancer device 100 consists of multiplexer unit 240, pipeline unit 245, demultiplexer unit 250, packet memory 255, multiplexer unit 260 and switching unit 285. After packets have been processed in network processors 110-1 to 110-4, they are transmitted to pipeline unit 245 through multiplexer unit 240 and then to demultiplexer unit 250. Depending upon packet flow status, incoming packets are stored in packet memory 255 or transmitted to multiplexer unit 260 to be outputted through switching unit 285. Switching unit 285 analyzes packet headers to determine outgoing port 130-j of high-speed link to which packets have to be sent.

Sequenced load balancer device 100 further comprises a data flow control to recombine packets after processing, comprising header parsing unit 205, hash function unit 210, balance history memory 215, packet queue and dequeue unit 265, update history unit 270, update queue pointer unit 275 and queue pointer memory 280. It is to be noticed that header parsing unit 205, hash function unit 210 and balance history memory 215 are used to analyze packets before and after processing. After packets have been processed in network processors 110-1 to 110-4, they are transmitted to header parsing unit 205 through multiplexer unit 240. Processed packet headers are parsed so as to extract flow identifiers that are transmitted to hash function unit 210 wherein flow identifiers are hashed to generate flow buckets.

Flow bucket identifier is used as an index in balance history memory 215 to access balance history that is used by packet queue and dequeue unit 265 to determine whether a processed packet has to be transferred to outgoing port 130-j or needs to be memorized in packet memory 255. Flow bucket identifier is also used as an index to store packet pointer in queue pointer memory 280 when a processed packet needs to be stored in or retrieved from packet memory 255.

Packet queue and dequeue unit 265 analyzes balance history received from balance history memory 215 to compare the identifier of network processor having processed current packet with the one of the earliest network processor of the flow bucket of which current packet is part. If they are not equal, current processed packet is stored in packet memory 255 and corresponding pointer is stored in queue pointer memory 280 according to current processed packet flow bucket and the identifier of the network processor having processed current processed packet. If identifiers are equal, current processed packet is directly transmitted to outgoing port 130-j and the packet queue and dequeue unit 265 decreases the number of processed packets associated to the earliest network processor identifier through update history unit 270. If this number reaches zero, the identifier of the earliest network processing means is also updated: it is set to the next in the balance history, and packets queued in packet memory corresponding to the new earliest network processing means identifier may be transmitted to the high-speed link and then removed from the packet memory.

It is to be noticed that if the number of incoming ports 120 is equal to one, then Time Division Multiplexing unit 200 is not required. Likewise, if the number of outgoing ports 130 is equal to one, switching unit 285 is not required.

FIG. 4 illustrates the use of several sequenced load balancers 100, as described above, in a high-end switching or router system 400. Such a system typically comprises at least one switching fabric 410, having a plurality of high-speed links, e.g. 64×64 port switch with each port capable of sustaining a full duplex 40 Gb/s traffic. In this example, four network processors are required to handle a single half duplex high-speed link, i.e. an incoming or outgoing port. Thus, switching system 400 comprises as many sequenced load balancers, generically referred to as 100-r, as half duplex links wherein each sequenced load balancer 100-r connects four network processors, NP Ar, NP Br, NP Cr and NP Dr. The sequenced load balancers are drawn in a “U” shape in order to simplify the overall drawing and to clarify how the aggregation of sequenced load balancers and network processing means appears to the rest of the system as a single network processing means of higher power.

To illustrate the behavior of the system presented in FIG. 4, let us consider a particular packet flow received in sequenced load balancer 100-1 through incoming port 120-s. Packets of this flow are dynamically distributed over network processors NP A1, NP B1, NP C1 and NP D1, according to the method of the invention, to be processed. Then, packets are recombined in sequenced load balancer 100-1, keeping packet order, to be transmitted to switching fabric 410 through outgoing port 130-t. Packets are routed in switching fabric 410 and transmitted, for example, to incoming port 120-u of sequenced load balancer 100-2. Again, packets are dynamically distributed over network processors NP A2, NP B2, NP C2 and NP D2 of sequenced load balancer 100-2, still according to the method of the invention, to be processed before being recombined in sequenced load balancer 100-2, still keeping packet order, to be transmitted to a network or network device (not represented) through outgoing port 130-v.

While the invention has been described in terms of preferred embodiments, those skilled in the art will recognize that the invention can be implemented differently. Likewise, in order to satisfy local and specific requirements, a person skilled in the art may apply to the solution described above many modifications and alterations all of which, however, are included within the scope of protection of the invention as defined by the following claims.

1. A method for ordered dynamic distribution of packet flows from a high-speed link over network processors that comprises the steps of: parsing the header of an incoming packet to extract a flow identifier; creating a balance history associated with said flow identifier; determining a current network processor by analyzing network processor loading; if an identifier of said current network processor is different than the one having previously processed at least one packet of the same flow, storing said identifier of said current network processor in said balance history and setting to one the number of packets processed by said current network processor in said balance history; else if said identifier of said current network processor is the same as the one having previously processed at least one packet of the same flow, increasing the number of packets processed by said current network processor in said balance history; and, routing said incoming packet to said current network processor.
2. The method of claim 1 wherein said step of routing said incoming packet to said current network processor comprises: storing said incoming packet into a memory associated with said current network processor; and, transmitting packets stored in said memory associated with said network processor upon request of said network processor.
3. The method of claim 2 wherein said step of analyzing network processor loading and setting a current network processor comprises: analyzing the loading of said memory associated to said network processor; and, setting said current network processor according to the loading of said memory associated to said network processor.
4. The method of claim 1 further comprising the step of time domain multiplexing incoming packets received through a multiplicity of incoming ports.
5. A method to recombine packets processed by a plurality of network processors according to the method as recited in claim 1, comprising: getting the earliest network processor identifier and associated number of processed packets from the balance history related to said flow identifier; if said processed packet has not been processed by the earliest network processor, it is stored in a packet memory; else if said processed packet has been processed by the earliest network processor, said processed packet is transmitted to the high-speed link, said associated number of processed packets is decremented and, if said associated number of processed packets reaches zero, said earliest network processor identifier changes to the next network processor identifier in said balance history and packets queued in said packet memory corresponding to the new earliest network processor identifier may be transmitted to the high-speed link and then removed from said packet memory.
6. The method of claim 5 further comprising the step of switching processed packets over a plurality of outgoing ports according to packet headers.
7. The method of claim 1 further comprising the step of hashing said extracted flow identifiers to generate other identifiers such that said other identifiers are identical for packets belonging to a same flow.
8. The method of claim 7 wherein said other identifiers are smaller than said extracted flow identifiers.
9. The method of claim 8 wherein said step of hashing said extracted flow identifier to generate a smaller identifier is based on a non-linear hashing function.
10. A sequenced load balancer for ordered dynamic distribution of packet flows from a high-speed link over network processors, said load balancer comprising: a parser for parsing the header of an incoming packet to extract a flow identifier; a processor for creating a balance history with said flow identifier; a processor for determining a current network processor by analyzing network processor loading; a processor for, if an identifier of said current network processor is different than the one having previously processed at least one packet of the same flow, storing said identifier of said current network processor in said balance history and setting to one the number of packets processed by said current network processor in said balance history, said processor also for, else if the identifier of said current network processor is the same as the one having previously processed at least one packet of the same flow, increasing the number of packets processed by said current network processor in said balance history; and a router for routing said incoming packet to said current network processor.